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Abstract —We propose a new method for trajectory planning 
to solve the data harvesting problem. In a two-dimensional 
mission space, N mobile agents are tasked with the collection 
of data generated at M stationary sources and delivery to a 
base aiming at minimizing expected delays. An optimal control 
formulation of this problem provides some initial insights 
regarding its solution, but it is computationally intractable, 
especially in the case where the data generating processes are 
stochastic. We propose an agent trajectory parameterization in 
terms of general function families which can be subsequently 
optimized on line through the use of Infinitesimal Perturbation 
Analysis (IPA). Explicit results are provided for the case of 
elliptical and Fourier series trajectories and some properties of 
the solution are identified, incfuding robustness with respect to 
the data generation processes and scalabiiity in the size of an 
event set characterizing the underlying hybrid dynamic system. 


I. Introduction 

Systems consisting of cooperating mobile agents are being 
continuously developed for a broad spectrum of applications 
such as environmental sampling [1],[2], surveillance [3], 
coverage control [4],[5],[6], persistent monitoring [7],[8], 
task assignment [9], and data harvesting and information 
collection [10],[11],[12]. The data harvesting problem arises 
in many settings, including “smart cities” where wireless 
sensor networks (WSNs) are being widely deployed for pur¬ 
poses of monitoring the environment, traffic, infrastructure 
for transportation and for energy distribution, surveillance, 
and a variety of other specialized purposes [13]. Although 
many efforts focus on the analysis of the vast amount of 
data gathered, we must first ensure the existence of robust 
means to collect all data in a timely fashion when the size 
of the sensor networks and the level of node interference do 
not allow for a fully wireless connected system. Sensors can 
locally gather and buffer data, while mobile elements (e.g., 
vehicles, aerial drones) retrieve the data from each part of 
the network. Similarly, mobile elements may themselves be 
equipped with sensors and visit specific points of interest 
to collect data which must then be delivered to a given 
base. These mobile agents should follow an optimal path 
(in some sense to be defined) which allows visiting each 
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data source frequently enough and within the constraints of 
a given environment like that of an urban setting. 

The data harvesting problem using mobile agents known 
as “message ferries” or “data mules” has been considered 
from several different perspectives. For a survey on different 
routing problems in WSNs see [14],[15] and references 
therein. In [16] algorithms are proposed for patrolling target 
points with the goal of balanced time intervals between 
consecutive visits. A weighted version of the algorithm 
improves the performance in cases with unequally valued 
targets. However, in this scenario the data need not be 
delivered to a base and visits to a recharging station are only 
necessary if the data mules are running out of energy. In [11] 
the problem is viewed as a polling system with a mobile 
server visiting data queues at fixed targets. Trajectories are 
designed for the mobile server in order to stabilize the 
system, keeping queue contents (modeled as fluid queues) 
uniformly bounded. 

In this paper, we consider the data harvesting problem as 
an optimal control problem for a team of multiple cooperat¬ 
ing mobile agents responsible for collecting data generated 
by arbitrary random processes at fixed target points and 
delivering these data to a base. The ultimate goal is for the 
data to be collected and delivered with minimum expected 
delay. Rather than looking at this problem as a scheduling 
task where visit times for each target are determined as¬ 
suming agents only move in straight lines between targets, 
we aim to optimize a two-dimensional trajectory for each 
agent, which may be periodic and can collect data from 
a target once the agent is within a given range from that 
target. Interestingly, the setting of the problem can also be 
viewed as an evacuation process where visits are needed to 
retrieve individuals from a set of target points which may be 
of non-uniform importance. In this paper, we limit ourselves 
to trajectories with no constraints due to obstacles or other 
factors. Clearly, in an urban environment this is generally 
not the case and the set of admissible trajectories will have 
to be restricted in subsequent work. 

We formulate a finite-horizon optimal control problem 
in which the underlying dynamic system has hybrid (time- 
driven and event-driven) dynamics. We note that the speci¬ 
fication of an appropriate objective function is nontrivial for 
the data harvesting problem, largely due to the fact that the 
agents act as mobile servers for the data sources and have 
their own dynamics. Since the control is applied to the mo- 


tion of agents, the objective function must capture the agent 
behavior in addition to that of the data queues at the targets, 
the agents, and the base. The solution of this optimal control 
problem (even in the deterministic case) requires a Two Point 
Boundary Value Problem (TPBVP) numerical solver which 
is clearly not suited for on-line operation and yields only 
locally optimal solutions. Thus, the main contribution of 
the paper is to formulate and solve an optimal parametric 
agent trajectory problem. In particular, similar to the idea 
in [17] we represent an agent trajectory in terms of general 
function families characterized by a set of parameters that we 
seek to optimize, given an objective function. We consider 
elliptical trajectories as well as the much richer set of Fourier 
series trajectory representations. We then show that we can 
make use of Infinitesimal Perturbation Analysis (IPA) for 
hybrid systems [18] to determine gradients of the objective 
function with respect to these parameters and subsequently 
obtain (at least locally) optimal trajectories. This approach 
also allows us to exploit ( i) robustness properties of IPA to 
allow stochastic data generation processes, (ii) the event- 
driven nature of the IPA gradient estimation process which 
is scalable in the event set of the underlying hybrid dynamic 
system, and (Hi) the on-line computation which implies that 
trajectories adjust as operating conditions change (e.g., new 
targets). 

In section [TT| we present an optimal control formulation 
for the data harvesting problem. In section [HI] we provide 
a Hamiltonian analysis leading to a TPBVP. In section [IV] 
we formulate the alternative problem of determining optimal 
trajectories based on general function representations and 
provide solutions using a gradient-based algorithm using 
IPA for two particular function families. Sections [V] and 
[VI] present the numerical results and the conclusions respec¬ 
tively. 

II. Problem Formulation 

We consider a data harvesting problem where N mo¬ 
bile agents collect data from M stationary targets in a 
two-dimensional rectangular mission space S = [0,Li] x 
[ 0 ,^ 2 ] C R 2 . Each agent may visit one or more of the M 
targets, collect data from them, and deliver them to a base. It 
then continues visiting targets, possibly the same as before 
or new ones, and repeats this process. By cooperating in 
how data are collected and delivered, the objective of the 
agent team is to minimize a weighted sum of collection and 
delivery delays over all targets. 

Let Sj(t ) = [sj(t), sj(i)] be the position of agent j at 
time t with s*(i) € [0, L\\ and s v 3 (t) G [ 0 ,^ 2 ]- The position 
of the agent follows single integrator dynamics: 

Sj(t) = uj(t) cos 9j(t), Sj (t ) = uj (t ) s in 6j(t ) (1) 

where Uj(t) is the scalar speed of the agent (normalized 
so that 0 < Uj(t) < 1) and 9j(t ) is the angle relative to 
the positive direction, 0 < 9j(t) < 2n. Thus, we assume 
that the agent controls its orientation and speed. An agent is 
represented as a particle, so that we will omit the need for 
any collision avoidance control. The agent dynamics above 
could be more complicated without affecting the essence of 
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Fig. 1. Data harvesting queueing model for M targets and N agents 
our analysis, but we will limit ourselves here to 0. 

Consider a set of data sources as points Wi G S, i = 

I with associated ranges r^, so that agent j can 
collect data from Wi only if the Euclidean distance Dy ( t ) = 

II Wi — s j (f) 11 satisfies Dy(t) < r t j. Similarly, there is a base 
at w B G S which receives all data collected by the agents. 
An agent can only deliver data to the base if the Euclidean 
distance D Bj (t) = || w Bj — Sj(t)\\ satisfies D B (t) < rsj- We 
define a function Py (t) to be the normalized data collection 
rate from target i when the agent is at Sj (t): 

Pij(t) = p(wi,Sj(t)) (2) 

and we assume that: (Al) it is monotonically non-increasing 

in the value of Dij(t) = ||Kjj — Sj(i)||, and (A2) it 

satisfies Pij(t) = 0 if Dij(t ) > ry-. Thus, Pij(t) can 
model communication power constraints which depend on 
the distance between a data source and an agent equipped 
with a receiver (similar to the model used in [11]) or sensing 
range constraints if an agent collects data using on-board 
sensors. For simplicity, we will also assume that: (A3) 
Pij(t) is continuous in Dy(f). Similarly, we define: 

= P{ W B’ S j( t )) ( 3 ) 

The data harvesting problem described above can be viewed 
as a polling system where mobile agents are serving the 
targets by collecting data and delivering them to the base. 
Figure [J shows a queueing system in which each P t j (t ) is 
depicted as a switch activated when Dy(t) < r.y to capture 
the finite range between agent j and target i. All queues 
are modeled as flow systems whose dynamics are given next 
(however, as we will see, the agent trajectory optimization 
is driven by events observed in the underlying system where 
queues contain discrete data packets so that this modeling 
device has minimal effect on our analysis). As seen in Fig. 
[U there are three sets of queues. The first set includes the data 
contents Xi(t) G R + at each target i = 1, ...,M where we 
use <Tj(t) as the instantaneous inflow rate. In general, we treat 
{c Ti(t )} as a random process assumed only to be piecewise 
continuous; we will treat it as a deterministic constant only 
for the Hamiltonian analysis in the next section. Thus, at 
time t, Xi(t) is a random variable resulting from the random 
process {cu(f)}. The second set of queues consists of data 
contents Z l3 ( t ) G K + onboard agent j collected from target 
i as long as Pi 3 (t) > 0. The last set consists of queues 
Yi(t) G R + containing data at the base, one queue for each 
target, delivered by some agent j as long as P B . ( t ) > 0. 
Note that (Ai(f)}, {Zi 3 (t)} and {Tj(f)} are also random 
processes and the same applies to the agent states {sj(t)}, 








j = 1,... ,N, since the controls are generally dependent on 
the random queue states. Thus, we ensure that all random 
processes are defined on a common probability space. The 
maximum rate of data collection from target i by agent j 
is Hij and the actual rate is 1-HjPij it) if j is connected to 
i. We will assume that: (A4) only one agent at a time is 
connected to a target i even if there are other agents l with 
I\i (t) > 0; this is not the only possible model, but we adopt 
it based on the premise that simultaneous downloading of 
packets from a common source creates problems of proper 
data reconstruction at the base. The dynamics of X, : (t), 
assuming that agent j is connected to it, are 


Xi{t) = 


0 if Xi(t) = 0 and cr^i) < 

<Ji(t) — otherwise 

(4) 

Obviously, Xi(t) = <Ji(t) if Pij(t) = 0, j = 1,..., N. In 
order to express the dynamics of Zjjit), let 

if Xi(t) = 0 and Pij(t) > 0 
otherwise 

(5) 
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This gives us the dynamics: 

~ ,.s Jo if Zij(t) = 0 and jiij{t)Pij(t) - PijP Bj {t) < 0 
Zy(t)= \ ikjmM-PiMt) otherwise 

( 6 ) 

where fyj is the maximum rate of data from target i delivered 
by agent j. For simplicity, we assume that: (A5) \\wi — 
wb\\ > r ij+ r Bj for all i = 1,..., M and j = 1,..., N, i.e., 
the agent cannot collect and deliver data at the same time. 
Therefore, in © it is always the case that Pij{t)PBj(t) = 0. 
Finally, the dynamics of Y, (t) depend on Zij(t), the content 
of the on-board queue of each agent j from target i as long as 
P Bj {t) > 0. We define ft(f) = Ef=t PijPsj (*) 1 [Zij (*) > 
0] to be the total instantaneous delivery rate for target i data, 
so that the dynamics of Y,(f) are: 

Yi(t)=0i(t) (7) 


Our objective is to maintain minimal values for all target and 
on-board agent data queues, while maximizing the contents 
of the delivered data at the base queues. Thus, we define 
..., Xmi t) to be the weighted sum of expected target 
queue contents (recalling that {<Ji(t)} are random processes): 

M 

Ji(x u ... Xm, t) = J2 <b E l x M (8) 

i= 1 

where the weight qi represents the importance factor of target 
i. Similarly, we define a weighted sum of expected base 
queues contents: 

M 

MY U ... Ym, t) = J2 <h E \YM (9) 

i =1 

For simplicity, we will in the sequel assume that qi = 1 for 
all i without affecting any aspect of our analysis. Therefore, 
our optimization objective may be a convex combination of 
© and ©. In addition, we need to ensure that the agents are 
controlled so as to maximize their utilization, i.e., the fraction 
of time spent performing a useful task by being within range 
of a target or the base. Equivalently, we aim to minimize the 
non-productive idling time of each agent during which it is 


not visiting any target or the base. Let 

D±(t) = max{0, D i:j {t) - nj), D+(t) = max(0, D B .(t) - r B .) 

(10) 

so that the idling time for agent j occurs when Dfj{t) > 0 
for all % and D~^At) > 0. We define the idling function 

M \ 

l + D n j ( t )Ii D ^ t n (H) 

This function has the following properties. First, Ij{t) = 0 if 
and only if the product term inside the bracket is zero, i.e., 
agent j is visiting a target or the base; otherwise, Ij ( t) > 0. 
Second, I jit) is monotonically nondecreasing in the number 
of targets M. The logarithmic function is selected so as to 
prevent the value of I jit) from dominating those of Ji(-) 
and J 2 (•) when included in a single objective function. We 
define: 

N 

Mt) = ( 12 ) 

i=t 

where Mj is a weight for the idling time effect relative to 
Ji(-) and T 2 (■). Note that Ijit) is also a random variable 
since it is a function of the agent states Sj(i), j = 1,..., N. 
Finally, we define a terminal cost at T capturing the expected 
value of the amount of data left on board the agents, noting 
that the effect of this term vanishes as T goes to infinity as 
long as all E[Zij(T)] remain bounded: 

M N 

J /( T )=yEE^( r )] (!3) 

*=1 3 = 1 

We can now formulate a stochastic optimization problem PI 
where the control variables are the agent speeds and headings 
denoted by the vectors u(f) = [mi(£), ..., Mjv(f)] and 0{t) = 
[0i {t ),..., 9Nit)] respectively (omitting their dependence on 
the full system state at t). We combine the objective function 
components in ®, ©, d~i~2l) and (fl3l) to obtain: 

u(™6Ht) J ( T ) = ^ fo ( qJi W - (! - a )Mt) + M*)) + J f ( T ) 

(14) 

where a £ [0,1] is a weight capturing the relative importance 
of collected data as opposed to delivered data and 0 < 
Uj(t) < 1, 0 < 9j{t) < 2tt. To simplify notation, we have 
also expressed Ji(Xi,... Xm, t) and JAY\.... Ym, t) as 
Ji(f) and ./ 2 (f). 

Since we are considering a finite time optimization prob¬ 
lem, instability in the queues is not an issue. However, 
stability of such a system can indeed be an issue in the 
sense of guaranteeing that E[Xiit)\ < 00 , E[ZjjiT)\ < 00 
for all i,j under a particular control policy when t —t 00 . 
This problem is considered in [11] for a simpler deterministic 
data harvesting model where target queues are required to be 
bounded. In this paper, we do not explicitly study this issue; 
however, given a certain number of agents, it is possible 
to stabilize a target queue by designing agent trajectories 
to ensure that the queue is visited frequently enough and 
periodically emptied. 


I jit) = log 



III. Optimal Control Solution 


In this section, we address PI in a setting where all data 
arrival processes are deterministic, so that all expectations 
in (fxIi-i THT i degenerate to their arguments. We proceed with 
a standard Hamiltonian analysis leading to a Two Point 
Boundary Value Problem (TPBVP) [19] where the states and 
costates are known at t = 0 and t = T respectively. We 
define a state vector and the associated costate vector: 

X(f) = [X^tX M (t),Y 1 {t),...,Y M (t), 

Zu(t),Z MN (t),sf{t),s\(t),s x N (t), s y N (t)} 
\{t) = [Ai(f),...,AM(f),7i (*)>•• 

0n(f),... ,0Miv(f),?7i(f),r?i(f),. • ■ ,V%(t):V v N (t)\ 

The Hamiltonian is 

H (X, A, u, 0) = 1 [aJi(t) - (1 - a)J 2 (t) + J 3 (f)] 

+ X \i(t)Xi(t) + X^ i(t)Yi(t) + XX^V^vW 

i i i j 


+ X (rf(t)uj ( t ) cos Oj (t) + rf ( t) Uj (t) sin Oj (f)) 


(15) 


where the costate equations are 

= = A ; (T) = 0 

7tW = -|f = i ^ 7» (T) = 0 
'feW = ~~§z~ = ^ ‘feCH = ~qz~ 


. xu , dH 




+ X + XI 


dH 


~~d^~ 


d 


X Aj(f)Xj(f) 


X QgV'yiityY'iit) +X fl e y < l ) ij(t)Zij(t) 


ds y 


(16) 

v x (T)=rj y (T) = 0 

From (fl5l >. after some trigonometric manipulations, we get 
TT(X, A. u, 0) = i [aJr(f) - (1 - a) J 2 (f) + J 3 (f) 

+ X MWi) + X+ X X ■>'3>)*,,(>) 


+ X ^•(f) s gn(t?]'(t))^/ Vj (tf + Vj (tf sm(9j (f) + 0y(f)) 
... (17) 

where tan ipj(t) = for ryj(f) / 0 and 0j(f) = 

sgn(? 7 j (f))f- if Vj{t) — 0- Applying the Pontryagin principle 

to (fl5l) with (u *,0*) being the optimal control, we have: 

iT(X*, A*, u*, 6*) = min if(X,A,u,0) (18) 

u (t),e(t) 

From dT7l i we easily see that we can always make the Uj(t) 
multiplier to be negative, hence, recalling that 0 < Uj(t) < 1, 
u*(t) = 1 (19) 


Following the Hamiltonian definition in (fl5l) we have: 

BH 

of = -Vj (i)uj ( t ) sin 6j (t) + (; t)u, (t) cos 9j (t) (20) 

and setting = 0 the optimal heading 8* (t) should satisfy: 

rfXt) 

tan 9j (t) = (21) 

Vj{t ) 

Since u*(t) = 1, we only need to evaluate 9*{t) for all 
t £ [0,T], This is accomplished by discretizing the problem 
in time and numerically solving a TPBVP with a forward 
integration of the state and a backward integration of the 
costate. Solving this problem quickly becomes intractable as 
the number of agents and targets grows. However, one of the 
insights this analysis provides is that under optimal control 
the data harvesting process operates as a hybrid system with 
discrete states (modes) defined by the dynamics of the flow 
queues in ©. ©, 0, while the agents maintain a fixed 
speed. The events that trigger mode transitions are defined 
in TableQ](the superscript 0 denotes events causing a variable 
to reach a value of zero from above and the superscript + 
denotes events causing a variable to become strictly positive 
from a zero value): 


TABLE I 

Hybrid System Events 


Event Name 

Description 

i-ff 

X{(t) hits 0, for i — 1,... 

M 

2 - 

Xi ( t ) leaves 0, for i = 1,. 

.., M. 

3^ 

Zij(t) hits 0, for i = 1,.. 

, M. j = 1,..., N 

4. ST. 

1-3 

DT (t) leaves 0, for i = 1, 

• ■ •, M, j = 

5. 5°. 

*3 

Dfj (f) hits 0, for i = 1,.. 

,M, j = 1,. ,.,N 

e. a; 

D+. ( t) leaves 0, for j = 1 

...,7V 

~ 

<1 

DT, (t) hits 0, for j = 1,. 

.,7V 


Observe that each of these events causes a change in at 
least one of the state dynamics in ©. ©. ([7}. For example, 
causes a switch in © from Xi(t) — <7i(t) ) 

to Xi(t) = 0. Also note that we have omitted an event (A 
for Z,j (t ) leaving 0 since this event is immediately induced 
by when agent j comes within range of target i and 
starts collecting data causing Zij(t ) to become positive if 
Zij(t) = 0 and A’j(f) > 0. Finally, note that all events above 
are directly observable during the execution of any agent 
trajectory and they do not depend on our model of flow 
queues. For example, if Xi(t) becomes zero, this defines 
event regardless of whether the corresponding queue is 
based on a flow or on discrete data packets; this observation 
is very useful in the sequel. 

The fact that we are dealing with a hybrid dynamic system 
further complicates the solution of a TPBVP. On the other 
hand, it enables us to make use of Infinitesimal Perturbation 
Analysis (IPA) [18] to carry out the parametric trajectory op¬ 
timization process discussed in the next section. In particular, 
we propose a parameterization of agent trajectories allowing 
us to utilize IPA to obtain a gradient of the objective function 
with respect to the trajectory parameters. 























IV. Agent Trajectory Parameterization and 
Optimization 


The idea here is to represent each agent’s trajectory 
through general parametric equations 


Sj(t) = s v j (t)=g{Q j ,p j (t)) ( 22 ) 

where the function pj(t) controls the position of the agent 
on its trajectory at time t and Qj is a vector of parameters 
controlling the shape and location of the agent j trajectory. 
Let 0 = [0i,..., ©at]. We now replace problem PI in ([14} 
by problem P2: 

1 


f T _ 


mm — 
ee f 0 T , 


ctJi(0, f) — (1 — a) J 2 ( 0 , t) + J 3 ( 0 , t) 


dt 


J f (Q,T) 


(23) 

where we return to allowing arbitrary stochastic data arrival 
processes {07 (f)} so that P2 is a parametric stochastic opti¬ 
mization problem with F@ appropriately debited depending 
on (122} . The cost function in ([23} is written as 

J(0, T; X(0,0)) = E[C(G, T; X(0,0))] 
where £(0, T; X(0, 0)) is a sample function debned over 
[0, T] and X(0, 0) is the initial value of the state vector. For 
convenience, in the sequel we will use £ 1 , £ 2 , £ 3 , £/ to 
denote sample functions of Ji, J 2 , J 3 and J/ respectively. 
Note that in d23} we suppress the dependence of the four 
objective function components on the controls u(i) and 
6{t) and stress instead their dependence on the parameter 
vector 0. In the rest of the paper, we will consider two 
families of trajectories motivated by a similar approach 
used in the multi-agent persistent monitoring problem in 
[20]: elliptical trajectories and a Fourier series trajectory 
representation which is more general and better suited for 
non-uniform target topologies. The hybrid dynamics of the 
data harvesting system allow us to apply the theory of IPA 
[18] to obtain on line the gradient of the sample function 
£(0,T;X(0,O)) with respect to 0. The value of the IPA 
approach is twofold: (*) The sample gradient V£(0,T) 
can be obtained on line based on observable sample path 
data only, and (ii) VC(Q,T) is an unbiased estimate of 
VJ(0,T) under mild technical conditions as shown in [18]. 
Therefore, we can use V£(0,T) in a standard gradient- 
based stochastic optimization algorithm 

0 /+1 = 0 ' -zz,V£(0 / ,T), ( = 0,1,... (24) 

to converge (at least locally) to an optimal parameter vector 
0 * with a proper selection of a step-size sequence {v>i} [ 21 ]. 
We emphasize that this process is carried out on line, i.e., 
the gradient is evaluated by observing a trajectory with given 
0 over [0, T] and is iteratively adjusting it until convergence 
is attained. 

1) IPA equations: Based on the events debned earlier, 
we will specify event time derivative and state derivative 
dynamics for each mode of the hybrid system. In this process, 
we will use the IPA notation from [18] so that 77 ,, is the 
kth event time in an observed sample path of the hybrid 
system and r k = X'(t) = ^ are the Jacobian matrices 

of partial derivatives with respect to all components of the 


controllable parameter vector 0. Throughout the analysis we 
will be using (■)' to show such derivatives. We will also use 
fk(t) = ypr to denote the state dynamics in effect over an 
interevent time interval [ 17 -, 77 ,+ 1 ). We review next the three 
fundamental IPA equations from [18] based on which we 
will proceed. First, events may be classibed as exogenous or 
endogenous. An event is exogenous if its occurrence time is 
independent of the parameter 0, hence r fc = 0. Otherwise, an 
endogenous event takes place when a condition gfc(0, X) = 
0 is satisbed, i.e., the state X(t) reaches a switching surface 
described by g*,(0, X). In this case, it is shown in [18] that 

<25. 

as long as ^xfk(i~ k ) ^ 0. It is also shown in [18] that the 
state derivative X'(t) satisbes 

!L X '( yt ) = ^ X '(^ + ( ML : t£[T k ,r k+1 ) (26) 

x '( T k) = X \ T k) + [fk-i i T k) - fk(r£)]T k ' (27) 
Then, X'(t) for t £ [r k , T k +i) is calculated through 

x '(t) = X'(t+) + J j t x '(t)dt (28) 

Table U contains all possible endogenous event types for 
our hybrid system. To these, we add exogenous events K,j, 
i = 1 to allow for possible discontinuities (jumps) 

in the random processes { 07 (f)} which affect the sign of 
07 (f) — p,ijPij(t) in <|4}. We will use the notation e(r k ) to 
denote the event type occurring at t = r k with e[r k ) £ E, 
the event set consisting of all endogenous and exogenous 
events. Finally, we make the following assumption which is 
needed in guaranteeing the unbiasedness of the IPA gradient 
estimates: (A 6 ) Two events occur at the same time w.p. 0 
unless one is directly caused by the other. 


2) Objective Function Gradient: The sample function 
gradient V£(0,T) needed in (124} is obtained from (123} 
assuming a total of K events over [0 T] with t k+1 = T 
and r 0 = 0 : 

1 pT 

V£(0,T;X(0;O))) = -v[/ (a£i(0,t) - (1 - a)C 2 (Q,t) + C 3 (e,t))dt\ 
+ V£/(0,T) 

= r («£!(©>*) -a- 0)^2(0,t) + £ 3 (e,t))dtl 

k=0 Jt >* 


+ V£/(0,T) 

= i fE ( Q ( r v£i(©,t)A + A(0, T k+ 1 y k+ 1 - A(0, T k y k ) 

k =0 Jt fc 

-(!-«)(/ vc 2 (e,t)dt + c 2 (@,Tk+i)T k+1 -£ 2 (,o,T k y) 

J Tk 

+ V£ 3 (B, t)dt + £ 3 (e, T k+1 y+i - £ 3 (0, Tfc)Tfc)] + (0, T ) 

= i ^ 1+1 (aVCdep)^ - (1 - a)VC 2 (0,t)dt + VC 3 (e,t)dtjj 


+ VC f (Q,T) 

(29) 

The last step follows from the continuity of the state 
variables which causes adjacent limit terms in the sum to 
cancel out. Therefore, V£(0,T) does not have any direct 
dependence on any r(; this dependence is indirect through 
the state derivatives involved in the four individual gradient 
terms. Referring to ([ 8 }, the brst term involves V£i(0,t) 




which is as a sum of X[{t) derivatives. Similarly, V£2(0, t) 
is a sum of F/(t) derivatives and V£/(0,T) requires only 
Z' l3 (T). The third term, V£ 3 (0,f), requires derivatives of 
Ij(t ) in (ITIT i which depend on the derivatives of the max 
function in (flOt and the agent state derivatives Sj(t) with 
respect to 0. Possible discontinuities in these derivatives 
occur when any of the last four events in Table Q] takes place. 

In summary, the evaluation of (l29t requires the state 
derivatives X[(t), Z' i j(t), Y?(t), and Sj(t). The latter are 
easily obtained for any specific choice of / and g in <122 b 
and are shown in Appendix Q] The former require a rather 
laborious use of CZU-CETJ which, however, reduces to a 
simple set of state derivative dynamics as shown next. 

Proposition 1: After an event occurrence at t = r k , the 
state derivatives X[(t^ ), Y[{t k ), Z[ 3 (t£), with respect to 
the controllable parameter 0 satisfy the following: 

( 0 if e(T k ) = 


X i( T k) = { X i( T k ) - M il(t)Pil(Tk)T k if e{Tk) = S± 


X 'r(r k ) 

where l / j with P i; (r fc ) > 


dsj 


dsj (c 

~dO \ 


_ dDjj(sj) d Sj f dPjjjsj) 


dsj 


-Si 


otherwise 

0 if such l exists and 

-l 


f Y i( T k ) + Z ij( T k ) if e ( r fc) = C° 
l Y K T k) otherwise 


f 0 if e(rfe) = 

Z'M)=\ Z 'ii(\)+ X ^k) ^ e(r fc ) = C° 

{ z 'ij( T k ) otherwise 

where e(r k ) = occurs when j is connected to target i. 

Proof: See (ED, G3, (|62>, (ZD. <|73>, (65) in 

Appendix [HI] 

This result shows that only three of the events in E can 
actually cause discontinuous changes to the state derivatives. 
Further, note that X[ (t) is reset to zero after a event. 
Moreover, when such an event occurs, note that Z' ;j (t) 
is coupled to X[(t). Similarly for ZP(t) and Y-(t) when 
event occurs, showing that perturbations in 0 can only 
propagate to an adjacent queue when that queue is emptied. 

Proposition 2: The state derivatives X[{tX 1 ), T/jrr 
with respect to the controllable parameter 0 satisfy the 
following after an event occurrence at t = r k : 


x ,(- x / 0 if e(Tfc) = 

* k+1 l X i( T k) ~ Q +1 Vij P iA u ) du otherwise 

r r k +1 

Y l{r k+1 ) = Y({r k ) + / p'fu)du 

J Tk 

where j is such that Pij(t) > 0, t £ [r k ,T k+ 1 ). 

Proof: See (l58l i. (l6ll> and ( 16 31 in Appendix Hill 
Proposition 3: The state derivatives ZP(T k+1 ) with respect 
to the controllable parameter 0 satisfy the following after an 
event occurrence at t = r k : 
i- If j is connected to target i, 

z < (t - ) - f z 'iA T k) 

^A T k+i) 


e(r fe ) = £>, C° or S± 


Z ij ( T k ) + / r ? +1 Pii P lj ( u ) du otherwise 


ii- If j is connected to B with Zij{rk) > 0, 

f T k+l 

z 'ij{r k+1 ) = Z ij{T k ) - / PijP' B j ( u)du 

J Tk 


Hi- Otherwise, Z',(T h , ,) = 

Proof: See (166b . (167b . (l74b and ( 18 1 b in Appendix |III] 

Corollary 1: The state derivatives X[{t), Z' ij {t), Y({t) 
with respect to the controllable parameter 0 are independent 
of the random data arrival processes {(Ti(f)}, i = 1,..., M. 
Proof: Follows directly from the three Propositions. 

There are a few important consequences of these results. 
First, as the Corollary asserts, one can apply IPA regardless 
of the characteristics of the random processes {oi(t)}. This 
robustness property does not mean that these processes do 
not affect the values of the X'(t), ZP(t), Y- {t): this happens 
through the values of the event times r k , k = 1,2 ,..., which 
are observable and enter the computation of these derivatives 
as seen above. Second, the IPA estimation process is event- 
driven: X[(t£), YI(t£), ZC(t£) are evaluated at event 
times and then used as initial conditions for the evaluations 
of X'(r" +1 ), y/(r H i), Z' 3 (Tf +i ) along with the integrals 
appearing in Propositions 2,3 which can also be evaluated 
at t, = T k+ \. Consequently, this approach is scalable in the 
number of events in the system as the number of agents 
and targets increases. Third, despite the elaborate derivations 
in the Appendix, the actual implementation reflected by 
the three Propositions is simple. Finally, returning to ( l29b . 
note that the integrals involving V£i (0 ,f), V£ 2 ( 0 ,f) are 
directly obtained from X'(t), Y((f), the integral involving 
V£ 3 ( 0 ,f) is obtained from straightforward differentiation 
of ( I I 1 b . and the final term is obtained from ZP(T). 

3) Objective Function Optimization: This is carried out 
using d24b with an appropriate step size sequence. 


A. Elliptical Trajectories 


Elliptical trajectories are described by their center coor¬ 
dinates, minor and major axes and orientation. Agent j’s 
position Sj (t) = [s* (t ), Sj (t )] follows the general parametric 
equation of the ellipse: 

Sj (t) = Aj + a,j cos pj (t) cos < j>j — bj sin pj (t) sin <pj 
s V j (t) = Bj + aj cos pj (t) sin (f>j + bj sin pj (t) cos fj 

(30) 


Here, 0 ? = [Aj , B :j . aj, bj, (pj] where Aj,Bj are the coordi¬ 
nates of the center, aj and bj are the major and minor axis 
respectively while (pj £ [0, n) is the ellipse orientation which 
is defined as the angle between the x axis and the major axis 
of the ellipse. The time dependent parameter Pj(t) is the 
eccentric anomaly of the ellipse. Since the agent is moving 
with constant speed of 1 on this trajectory from (IT9] >. we 
have sj(f) 2 + s V j(t ) 2 = 1 which gives 


Pj(t) = 


(a sin pj (t) cos pj + bj cos pj (t) sin pj'j 
+ ( a sin pj (t) sin pj — bj cos pj (t ) cos pj 


(31) 

In the data harvesting problem, trajectories that do not pass 
through the base are inadmissible since there is no delivery 
of data. Therefore, we add a constraint to force the ellipse 
to pass through w B = [wf, wjj] where: 

wf =Aj + aj cos pj (t) cos pj — bj sin pj (t) sin pj 
=Bj + aj cos pj (t) sin pj + bj sin pj (t) cos pj 


( 32 ) 






Using the fact that sin 2 p(t) + cos 2 p(t) = 1 we define a 
quadratic constraint term added to J(0,T;X(0,0)) with a 
sufficiently large multiplier. This can ensure the optimal path 
passes through the base location. We define C :i (0 ,) which 
appears in (l34l i: 


Cj(Qj) = (l - fj cos 2 (j>j - fj sin 2 fa - fj sin 2(j> 3 ) (33) 

, ,1 t w Z-Aj\2 r W B-Bj\ 2 ,2 t w l~ A i\ 2 , 

where fj = (- B s—) + (^6—) . ij = (i-) + 

( w B~ B i \ 2 f 3 _ 

V a,j ) ’ j a/ 6 ^ 

Multiple visits to the base may be needed during the 
mission time [0, T\. We can capture this by allowing an agent 
trajectory to consist of a sequence of admissible ellipses. 
For each agent, we define £ :j as the number of ellipses in 
its trajectory. The parameter vector Oj with k = 1...., £,, 
defines the K th ellipse in agent j’s trajectory and Tf is the 
time that agent j completes ellipse k. Therefore, the location 
of each agent is described through k during [7~ re—1 , T*\ 
where 7J° = 0. Since we cannot optimize over all possible 
£j for all agents, an iterative process needs to be performed 
in order to find the optimal number of segments in each 
agent’s trajectory. At each step, we fix £j and find the 
optimal trajectory with that many segments. The process 
is stopped once the optimal trajectory with £j segments 
is no better than the optimal one with £j — 1 segments 
(obviously, this is not a globally optimal solution). We can 
now formulate the parametric optimization problem P2 e 


where 0, = [0],. .., Qy] and 0 = [0i,..., 0 jv]: 

in J e f \aJi(0,t) - {1 - a)J 2 (®,t) + J 3 (0,t)]dt 

F o -L J o L J 


mm 

©<=F 0 


N 

+ M C ^>(0;)+ ./,(©,T) 

0 =1 

(34) 

where Me is a large multiplier. The evaluation of VC, is 
straightforward and does not depend on any event. (Details 
are shown in Appendix HJ. 


B. Fourier Series Trajectories 

The elliptical trajectories are limited in shape and may 
not be able to cover many targets in a mission space. Thus, 
we next parameterize the trajectories using a Fourier series 
representation of closed curves [22], Using a Fourier series 
function for / and g in (l 22 t , agent j’s trajectory can be 
described as follows with base frequencies and fj: 

u 

s j(t) = «o j + 22 sin(27rn/? : pj(f) + <£*.,•) 

"T 1 (35) 

3 

s V j (*) = b o,j + 22 Vl s H 2 ™fjPj (t) + 

n= 1 

The parameter pit) £ [0, 27r], similar to elliptical trajectories, 
represents the position of the agent along the trajectory. In 
this case, forcing a Fourier series curve to pass through the 
base is easier. For simplicity, we assume a trajectory to start 
at the base and set sj(0) = w B , s?f(0) = w Assuming 
p(0) = 0, with no loss of generality, we can calculate the 


zero frequency terms by means of the remaining parameters: 

r* r y 

3 3 

a o,j = an j sin (^,j)’ b oj = w b~J 2 bn ’j sin (^«j) 


71=1 


71=1 


(36) 


The parameter vector for agent j is 0 7 = 

Iff j a o ,j> • ■ •, ar* ,boj, • • •, b r y,(f> i j,..., j, ■ ■ ■ ,^rv] 

and 0 = [0i,..., 0 jv]. Note that the shape of the curve is 
fully represented by the ratio ff/fj so one of these can 
be kept constant. For the Fourier trajectories, the fact that 
u* = 1 allows us to calculate Pj(t) as follows: 


1 -1/2 


fj a n,jTisin(2irfj pj(t) + 

u 

fj 22 fe njHsm(27r fjpj(t) + <f) y ni 


Pj(t) ' 


(37) 

Problem P2f is the same as P2 but there are no additional 
constraints in this case: 

mm Jf = t Iff ( aJ i (*) “ (! “ + M*)) + Jf{ T ) 

(38) 


V. Numerical Results 

In this section numerical results are presented to illustrate 
our approach. We consider 8 targets, 2 agents and a base 
as shown in Fig. [2] First, we assume deterministic arrival 
process with < 7 * = 0.5 for all i. For Q and ([3} we have used 
p(w , v) = max( 0,1 — D ' w ’ v t ) where r is the corresponding 
value of r.y or r Bj . We have pij = 50 and /3jj = 500 
for all i and j. Other parameters used are a = 0.5, r l:j = 
r Bj = 1, Mi = 1 and T = 100 except for the TPBVP case 
where T = 30. In Fig. [2] results of the TPBVP are shown 
which depend heavily on the initial trajectory and this is 
the best result among several initializations. These results 
are after 10,000 iterations of the TPBVP solver. In Fig. [3] 
the results are shown for the (locally) optimal trajectory 
with two ellipses in each agent’s trajectory (<5j = 2) and 
in Fig. H] for a Fourier series representation with 5 terms 
in (l35l >. Both methods converge in few iterations with each 
iteration taking less than a few seconds. We use the Armijo 
rule to update the step-size in each iteration. The average 
queue length at targets for TPBVP, Ellipse with £j = 2 
and Fourier series are 52.13, 49.23 and 62.03 respectively. 
Whereas The average throughput for the three trajectories is 
3.76, 4.2, 3.56 respectively. Although the example is a very 
symmetric configuration, the benefit of the Fourier series 
trajectories shows when the targets are randomly positioned. 
Then, initializing the TPBVP becomes a very hard task and 
ellipses cannot fit all targets. 

Based on Corollary [T| our results are independent of the 
underlying random processes (eu(f)}. To verify this property, 
we model the exact same problem with a uniform distribution 
for tJi(f) as U[0. 1,0.9], Note that we keep U[cr,(t)] = 0.5, 
the same rate as in the deterministic setting. At each iteration 
we generate a random sample path using the random process 
with <Ti(t ) ~ U[0. 1,0.9], The Fourier series trajectories 
for this stochastic optimization problem are shown in Fig. 












Fig. 2. 8-targets, 2-agents, TPBVP trajectories (T=30) J* = 15.82 Fig. 4. 8-targets, 2-agents, Fourier series trajectories (T=100) J* = 

-50.18 



Fig. 3. 8-targets, 2-agents, Elliptical trajectories (T=100) J* = —50.9 

0 with J* = —48.05 compared to J* = —50.18. The 
objective function converges almost as quickly but with some 
oscillations as expected. 


Fig. 5. 8-targets, 2-agents, Random data processes - Fourier series 
trajectories J* = —48.05 

Appendix I 

Elliptical Trajectories 


VI. Conclusions 

We have developed a new method for trajectory plan¬ 
ning in the data harvesting problem. An optimal control 
formulation provides initial insights for the solution, but it 
is computationally intractable, especially in the case where 
the data generating processes are stochastic. We propose an 
agent trajectory parameterization in terms of general function 
families which are optimized on line through the use of IPA. 
Explicit results are provided for the case of elliptical and 
Fourier series trajectories. We have shown robustness of the 
solution with respect to stochastic data generation processes 
by considering stochastic data arrivals at targets. Natural 
next steps include constraining trajectories to urban setting 
obstacles in the mission space. 


In order to calculate the IPA derivatives we need to 
have the derivative of state variable with respect to all the 
parameter vector 0j = [Aj. Bj, a 7 , bj. (pj] for all agents j. 
These derivatives do not depend on the events happening in 
the system since the trajectories of agents are fixed at each 
iteration. For now we assume £j = 1 for all j = 1 ,N 
hence, we drop the superscript. We have: 


£fl 

daj 


dSj 

dA^ 1 

COS Pj it) COS (f>j , 


£1 

9<t>j 


—cij cos pj (t) sin (j)j 
ds” 

—- = 0 
dAj 1 


OS* 

—- = 0 
dBj 

(39) 

ds x 

— = — sin pj (t) sin 

4>j 

— bj sin pj ( t ) cos 4>j 

(40) 

(41) 

dSj 

dBj =l 

(42) 



















































dsV ds v , 

—X = cos pj (t) sin cj)j , —— = sin pj ( t ) cos (pj (43) 

(J CL j (JU n 

ds v , 

-j- = a.j cos pj ( t ) cos (j)j — bj sin pj (t) sin (j>j (44) 
ocpj 

Also the time derivative of the position state variables are 
calculated as below: 

Sj(t) = —a,jPj(t) sin Pj(t) cos (pj + bjf>j(t) cos pj{t) sin. <f)j 

(45) 

af (t) = —djpj (t) sin pj ( t ) sin <f>j + bjPj ( t ) cos pj (t) cos (pj 

(46) 

The gradient of the last term in the J e in (l34l > needs to be 
calculated separately. We have for j ^ l, = 0 and for 

j = l - 

dA7 = ( — cos 2 4>j -qj- — sin (pj -qj- — sin 2 (pj ) 


dBj ~ ( COS ‘A? < Aj dB | S ^ n ^<Pj OB^ ) 

g- = 2 Cj ( - cos 2 (Pj - sin 2 Oj^f- sin 2^1^) 

S' = 2 Cj ( - cos 2 ^ - sin 2 ^ - sin 2 fa&) 


Off Off- 


dA 


da 


da 


OCj 

d<t>j 

= 2Ci((/j - 

- f 2 

3 3 

) sin 2 (pj — 

2 f] cos 

2 (pj) 


nf W B ~ A j' 

\ 

df] 2 


Bj \ 


H a 2 , 

h 

dBj 

\ b> 

) 


o ( ( W B — A 3 

)2 ) 

df] 

nf {wV-Bj)\ 


' «? 

J 

’ db, 


63 ) 

J 

(w x — Aj 

of B 3 


df] 2| 

( wV b~ B 3\ 

i 

-l 6J 

)’ 

dB, 

^ a j 


i 

9 ((K- b . 

V a:? 


II 

IN-^I ^ 

CO 

2 (K' 

~A,)\ 
b 3 J 


df] ( 


-a]){ w y- 

-Bj)\ 



dAj V 



J 



df] ( 

Vi 

- a)){w x - 




dBj V 



J 



df] _ 2 f 

K 

: -Aj)( w y 

~ Bj)\ 



da, V 



) 



df] o/«- 

~ A j)( w b - 

Bj)\ 



dbj " V 


6 ? 

) 



daop 


db 0 , 


ds x ds x 

3 =sin(27 TnfjPj(t)+(p*j), = 0 (48) 


da. 


n,J 


db n 


71, J 

ds x - 0s x 
—— = a n j cos(2t mff pj{t) + 4>n,j) = 0 ( 49 ) 


d(p: 


n,j 


d(p ; 


V 

n,3 


dfj 


= 2tt pj(t) Y a n jncos(2nnfjPj(t) + (p x n j), (50) 

(51) 


dsf dsf 

3 = 1, ——~— = 0 


db 0j ’ da 0 j 
ds v ds v 

-A- = sin(2 nnfjpjit) + $ ■). —A- = 0 (52) 

UUn,j i-'U j n j 

ds v . ds v 

—h = Kj cos^Tm/J Pj (t) + (p v n g) = 0 (53) 


dcp ; 


ds« 

—- = 0 

df? 


(54) 


Also the time derivative of the position state variables are 
calculated as below: 


= Pj{t ) Y 2tt nf x a nJ cos(2 nnff Pj(t)+<p*j), (55) 

n= 1 
r “ 

8j(t) = Pj(t) Y 27rn fj a n,j cos(2t TnfJpj(t) + (p x j), (56) 

n— 1 

Appendix III 

IPA EVENTS AND DERIVATIVES 

In this section, we derive all event time derivatives and 
state derivatives with respect to the controllable parameter 
0 for each event by applying the IPA equations. 

1. Event This event causes a transition from X, (t) > 
0, t < Tk to Xj(t) = 0, t > Tfc. The switching function is 
<7fc(0,X) = X, so = 1. From (l25l > and 0: 


(dgk . , - P\~ 1 (dg k dg k , _ \ 
Tk V dXj fk ^ Tfc V \dQ + dX, Xi ^ Tfc V 


X'(rX) 


(57) 


(Ti{Tk) - PijPij{Tk ) 

where agent j is the one connected to i at t = T k and we 
have used the assumption that two events occur at the same 
time w.p. 0, hence cr, : (r,7) = crj(Tfc). From (|26 |i-(| 27|>. since 
Xj(t) = 0, for r k <t < r k+1 : 


Appendix II 

Fourier Series Trajectories 

We calculate the position of agent j’s 

derivative with respect to all the Fourier 

parameters. The parameter vector is 0 7 = 

[fj,aoj, ■ ■ ■ ,arpboj ,..., b r y,(pi }j ,..., (prptpij, ■ ■ ■ >£r»]- 
So we have: 

ds? ds x 

3 = 1, = 0 (47) 


i^k ) = X 'i( T k ) + - PijPi; 

X i( T k )\ (T i{ T k) — PijPij 

= X'(Tk)~- V 


(58) 


X. 


- 0 


r k 


ai (t/j; ) fj/.j T\j (n, ) 


= 0 


(59) 

For X r (t), r ^ i, the dynamics of X r (t) in (|4]i are unaffected 
and we have: 

X' r (r+) = X' r (T^) (60) 

If X r (r k ) > 0 and agent l is connected to it, then 


dt Xr ^ dX r (t) X ' r ® + X>r ^ 

= ~ PrlPrl(T k 2j = -p r lPPi(t) 


(61) 


and if X r (t) = 0 in [r k , T k+ \] or if no agents are connected 
to i, then and ^ X' r {t ) = 0. 

For Y r (t), r = 1,... ,M, the dynamics of Y r (t) in 0 are 


































not affected by the event £? at rfc, hence 

= Y;{t~) (62) 

and since Y r (f) = (3 r (t), for T k < t < T k +i : 

^(t) = + ^ {t) = ^ (63) 

For Zij(t), we must have Ay(rfc) > 0 since Xj(Tj~) > 0, 
hence > 0 and from (127b : 

z 'ij ( T k ) = Z'ij ( T k ) + [Zij (r k ) - Zij{T +)] t[ 


= Z 'ij( T k ) + hi ( T k ) - P-ij( T k) P ij( T k) T k 


(64) 


Since Xi(r k ) > 0, from © we have jj,ij(r k ) = n%j- 
At r k , j remains connected to target i with fiij(T k ) = 
a i( T k) / p ij( T k) = a ’i( T k)/Pij(Tk) and we get 

^ ~ X i( T k) Vij p ij(T k )-cri{T k ) 

ZUr+) = Z'ij( T k ) + ~ 


^ 3Kk> ipkJ ‘ crj(rfe) — HijPij^Tk) 

= Z[ 3 {t~) + X[{r~) 


dZ^t) 

dO 


(65) 


(jj'ij(t) P ij{t) Pij P Bj(t)') 


( 66 ) 


From ( l26l ) for r k < t < T k+ j: 

, dZa(t) rr . , . 

Jt Zij ^ = dZij(t) Zi ^ 

_ dZij(t ) _ 8 

~ do ~ do 

Since p,ij(t) = a, (f) /Pi 3 (t) for the agent which re¬ 
mains connected to target i after this event, it follows 
that -jY [jj-ij(t)P,, j (£)] = 0. Moreover, P Bj (t ) = 0 by our 
assumption that agents cannot be within range of the base 
and targets at the same time and we get 

j t Z[ 3 {t) = 0 (67) 

Otherwise, for r ^ j , we have /r,>(f) = 0 and we get: 

±Z'. r {t) = -p ir P^{t) (68) 

Finally, for Z rj (t), r/iwe have ZY(r k ) — Z' rj (T k ). If 
Z, ' 

8 et dt^rj 


rj (t) = 0 in [Tfc,rfe + 1), then JjAA(f) = 0. Otherwise, we 


d - t Z' r Xt) from 


dt rj\ 

with i replaced by r. 


2. Event £4: This event causes a transition from X, (t) = 
0 , t < Tk to Xi{t) > 0, f > Tfc. Note that this transition can 
occur as an exogenous event when an empty queue X, (t) 
gets a new arrival in which case we simply have r k = 0 
since the exogenous event is independent of the controllable 
parameters. In the endogenous case, however, we have the 
switching function g k (0,X.) = <Ji(t) — g,ijPij(t ) in which 
agent j is connected to target i at t = tv Assuming = 

& Sr] T and k o = [«* from 

/_ / dgk dsj \ / dgk . A" 1 

Tk v dsj do)\d Sj (69) 

At Tfc we have cr, (r k ) = HijPij(Tk). Therefore from (f27l) : 
X'(r+) = X'(r-) + [Xi(r k ) - X i (r+)]r fc ' 

= ) + (o - o-i(rfc) + (7,))Tfc' = x'(r fc “) 

(70) 

Having Xj(f) > 0 in [r fc ,r fc+ 1 ) we know Xj(f) = cr;(f) - 
H, : j p ij (f) therefor, we can get -4 A'(f) from (|6H with r and 


l replaced by i and j. For X r (t), r 7^ i, if X r {jk) > 0 and 
agent l is connected to r then X r (rk) = <J r {Tk) — HriPriijk), 
therefor, we get X' r {T k ) from (l60l > while in [7^, 7^+1) we 
have ^A'(f) from (|6T1 >. If X r (rk) = 0 or if no agent is 
connected to target r, X r (rk) = 0. Thus, X' r (r k ) = X' r (r k ) 
and £ t X' r (t) = 0. 

For Y r (t), r = 1 ,M the dynamics of Y r (t) in © are 
not affected by the event at hence, we can get Y^(r k ) 
and ^F/(f) in [rfc,r fc+1 ) from (l62l > and (|631 > respectively. 
For Zij(t ) assuming agent j is the one connected to target 
i, we have: 


= Z[At7) + Zij( tZ ) - 4-(r+) 




= Z 'ij{T k ) + pii(Tfc ) - hi( T k)\ P ij( T kK = Z' l3 {T k ) 

(71) 

In the above equation, ^ij{r k ) = /i; 7 because A* (7)7) > 0. 
Also, HijPij{Tk) = crj(r fc ) and /Ety(r^) = results 

in idij{r k ) = Hij. For Zu(t), l ^ j , agent l cannot be 
connected to target i at so we have, Z' u (r k ) = Z' a (r k ) 
and )§ Z' a (t ) = 0 in [r fc ,r fe+ i). For Z ri {t) ,r / i and l ± j 
using the assumption that two events occur at the same time 
w.p. 0, the dynamics of Z r i (f) are not affected at Tk, hence 
we get 4: Z' rl (t) from d66l > for i and j replaced by r and l. 

3. Event (7° : This event causes a transition from Ay (f) > 
0 for f < Tk to Zij(t) = 0 for f > r*. The switching function 
is <7fc(0,X) = Zij(t) so = 1. From 


T/c 


\dZi 


\dO dZ, 


-z^ 


z[A T k ) 


^-( T fc) 

i- l ij( T k ) P ij( T k ) ~ fiij P Bj{ T k ) Pij P Bj( T k) 


(72) 


Since Ay (t) is being emptied at r*, by the assumption that 
agents can not be in range with the base and targets at the 
same time, we have Pij(jk) = 0. Then from d27l >: 

z ii i T k ) = Z^ (r k ) + [ - PijP BJ (r k ) - oj T k ' 


= Z ij (t k ) - [0ij p Bj (n) 

Since Z i3 (t ) = 0 in [r fc ,r fc+ 1): 


^■( T fc ) 

%3 P Bj ( T k ) 


= 0 


4 am = +^4#=0 


(73) 


(74) 


dZij(t) 13 v ' a© 

For Z r i(t), r 7 * or ( 7 j, the dynamics in © are not 
affected at Tfc, hence: 

Z'rl(Tk) = Z' rl (T k ) (75) 

if Z r i(Tk) > 0, the value for jjjZ' rl (t) is calculated by (l66t 
with r and l replacing i and j respectively. If Z r i(Tk) = 0 
then f t Z' rl (t)=0. 

For 14(f) we have /3i( T k ) = 0 since the agent has emptied 
its queue, hence: 


YI(t+) = YI(t-)+ 14(r fc -)-i4(r+) 


= Yl(T k ) + \p ij p BJ (T k )- 0] 

= l7(T,-) + A',(T-r) 


Z[j(Tk) 

%j p Bj ( T k ) 


( 76 ) 
































In [r k ,T k+ 1 ) we can get ^Y({t) = 0 . For Y r (t), r ^ i the 
dynamics of Y r (t) in E} are not affected by the event at 
r k hence, Y'(r k ) and ^jY'.(t) in [T k .T k +\ ) are calculated 
from (l6l and tf63l) respectively. The dynamics of X r (t), 
r = 1,... ,M is are not affected at T k since the event at 
Tfc is happening at the base. We have X' r (r k ) = X^(t/7). 
If X r (r k ) > 0 then we have j^X' r {t) from (IfTTb and if 
X r (T k ) = 0 then ^X' r (t) = 0 in [r kl T k+1 ). 


4. Event St ; : This event causes a transition from DfAt) = 
0 for t < r k to Dfj(t) > 0 for to t > r k . It is the moment 
that agent j leaves target i’s range. The switching function 
is 5fc(0, X) = Dij(t) - Vij , from El : 


Tk = - 


dDa 


dDij dsj _ ; 

dsj 90 V dsj 


■Sj\T k 


(77) 


If agent j was connected to target i at r k then by leaving the 
target, it is possible that another agent l which is within range 
with target i connects to that target. This means X,;(t9") = 


ai{Tk) ~HuPu{Tk) and Xi(r k ) = crj(r fe ) - HijPi 3 (r k ), with 
Pj.j (r k ) = 0, from (12Tb we have 

X'(r+) = X'(r k 7) - MiiPuCnH (78) 

If Xi{r k ) > 0, in [r k ,T k+ 1) is as in (|6TJ with r 

replaced by i and if Xi(r k ) = 0 then = 0. On the 

other hand, if agent j was not connected to target i at r k , we 
know that some l A j is already connected to target i. This 
means agent j leaving target i cannot affect the dynamics 
of Xi(t) so we have X'(r+) = X'(r“) and j- t X[(t) is 
calculated from ( 151 with r replaced by i. 

For X r (t), r/i the dynamics in © are not affected by the 
event at r k hence, we get X'.' r ( T k ) from (l60b . If X r (r k ) > 0 
the time derivative in [r k ,T k+ 1) can be calculated 

from (fOTb and if X r (r k ) = 0 then ^A'(f) = 0. 

For Y r (t), r = 1,...,, M, the dynamics in (O are not also 
affected by the event at r k hence, we get Y r (r k ) from ( 162b 
and in [r k ,T k+ 1) the 4fY r '(t) is calculated from (l63b . 

For the dynamics in © are not affect at T k , regardless 

of the fact that agent j is connected to target i or not. We have 


Zij(r k ) = fiij(T k )Pij{T k ) with Pij(r k ) = 0 and Z i:j [T k ) = 


0, hence from ( 1271 : 


Z 'iA T k) = Z lj( T k ) + Z v( T k ) - z iA T k) 


(79) 


= z ij ( T k ) + Aij (■ n)Pij (r k y k = Z'ij(r k ) 
and in [r fc ,r fe+ i) , we have j^Z[.{t) = 0 using ( |66b 
knowing P. l3 (r k ) = P Bj (r k ) = 0. For Z H (t), r^ioil^j, 
the dynamics of Z r i(t) are not affected at r k hence d75b 
holds and in [r k ,T k+ 1) again we can use d66b with i and j 
replaced by r and l. 


5. Event : This event causes a transition from 
> 0 for t < r k to D^(t) = 0 for to t > r k . The 
event is the moment that agent j enters target i’s range. 
The switching function is g k (Q,X.) = Dij(t ) — . From 

(El we can get r k from (1771 ). If no other agent is already 
connected to target i, agent j connects to it. Otherwise, if 
another agent is already connected to target i, no connection 
is established. For X,(f), the dynamics in © are not 


affected in both cases, hence, EHb holds. If Xi(t) > 0 in 
[Tfc,Tfc + i) we calculate jjjX'(t) using (16Tb with l being the 
appropriate connected agent to target i. If X; () = 0, 
■jjX'(t) = 0. For X r (t), r A i the dynamics in (|1 are not 
affected by the event at r k . Hence, we get X^(r k ) from 
(f60t . If X r (r k ) > 0 we calculate ^ X' r (t ) from (fOTt with i 
replaced by r and if X r (r k ) = 0 then jjjX' r (t) = 0. 

For Y r (t), r = 1,..., M again the dynamics in (Q are not 
affected at tau k so both ( 1621 and (l63b hold. 

For Zij (t), with agent j being connected or not to target i at 
r k the dynamics of Zij(t) are unaffected at r k , hence (171 
holds for i and j and in [r k , T k + 1) the is calculated 

through (1661 . For Z r i(t), r A * or ^ 7^ J the dynamics are 
unaffected d71 holds again. In [T k ,T k+ 1), ^ Z' rl (t ) is given 
through (l6l with i and j replaced by r and l. 


6. Event At: This event causes a transition from 
Dgj(t) = 0 for t < r k to Dgj(t) > 0 for t > r k . The 
switching function is g k (Q, X) = D B . ( t ) — r Bj . 

dD B3 ds ]( dD B 3 \ -i 

Tk 9s, 90 V 


ds 3 


Bi Si 


(80) 


Similar to the previous event, the dynamics of Xi(t) are 
unaffected at r k hence, we have X\{r k ) calculated from 
(f70b . If Xi(t) > 0 in [r k ,T k+ i) we calculate through 

(ED and if Xi(T k ) = 0, f t X[(t) = 0. 

For Y r (t), r = 1,...,, M. the dynamics of Y r (t) in Eb are 
not affected at r k , hence, we get Y r (r k ) from (l6l and in 
[T k ,T k + 1), -jjY;(t) is calculated from (l6l . 

For Zij(t), Using the fact that agent j can only be connected 
to one target or the base, we have Z i3 (t a 7 ) = /3i 3 (r k )P Bj (r k ) 
with P Bj (r k ) = 0 and Zi 3 {r k ) = 0, hence (171 holds with i 
and j replacing r and l. In [T kl T k + 1) from 


^ V' ff) _ d Z ij(t) ry! . 9Zij{t) 

~ Z ijY> ~ ov u\ A ijY) + 


dt v y ~' dZjj (t) 
9 Zij (t) 


90 


(81) 


= -PhKM 


90 J BjK 

As for Z r i(t), r A i or l A J th e dynamics are unaffected so 
(171 holds. In [ T kl T k+ \ ) we can calculate jjjZ' rl (t) through 
with j replacing l. 


7. Event A 1 -: This event causes a transition from D n/ (t) > 
0 for t < Tfc to Dgj(t) = 0 for t > r k . The switching 
function is g k (Q,X.) = D Bj (t ) — r Bj . Using El we can 
get r k from Similar with the previous event we have 
X'(r k ) from (ITOb . If Xi{t) > 0 we can get ^X[(t) from 
(l6lb and if X,(r,7) = 0 then ^A-(f) = 0. 

For Y r (t), r = 1 M, we again follow the previous 
event analysis so d62t and (IDb hold. 

For Zi 3 {t), the analysis is similar to event A^ so we can 
calculate Z,6(r A t) and in [r fc ,r fc+ i) from EB and 

(l6l respectively. Also for Z r i(t), r / t or 1 ^ j, (171 
holds with same reasoning as previous event. In [T kl T k + i ) 
we calculate from (l66b . 
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